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Abstract 



During the course of the last decade, travehng wave accel- 
erating structures for a future Linear Collider have been the 
object of intense R&D efforts. An important problem is the 
efficient computation of the long range wakefield with the 
ability to include small alignment and tuning errors. To 
that end, SLAC has developed an RF circuit model with 
a demonstrated ability to reproduce experimentally mea- 
sured wakefields. The wakefield computation involves the 
repeated solution of a deterministic system of equations 
over a range of frequencies. By taking maximum advan- 
tage of the sparsity of the equations, we have achieved sig- 
nificant performance improvements. These improvements 
make it practical to consider simulations involving an en- 
tire linac lO'' structures). One might also contemplate 
assessing, in real time, the impact of fabrication errors on 
the wakefield as an integral part of quality control. 

1 INTRODUCTION 

During the course of the last decade, SLAC has been con- 
ducting R&D on new generations of accelerating structures 
for a future machine, the Next Linear Collider (NLC). The 
culmination of this work is the Damped Detuned Struc- 
ture (DDS). Since it is difficult to dissipate deflecting mode 
power without also dissipating accelerating mode power, 
this structure achieves high efficiency (shunt impedance) 
by relying primarily on detuning to produce favorable phas- 
ing of the dipole modes to mitigate the dipole sum wake. To 
prevent the partial re-coherence of the long range wake, a 
small amount of damping is provided by extracting dipole 
mode energy through four manifolds which also serve as 
pumping slots. 

A linear collider is a complex system and detailed nu- 
merical simulations are essential to understand the impact 
of different random and/or systematic structure fabrication 
errors on beam quality. Assuming a (loaded) gradient of 
50 MV/m and a length of 2 m, each of the two arms of a 1 
TeV in the center-of-mass NLC would be comprised of ap- 
proximately 1000 structures. To simulate the effect of fab- 
rication errors on emittance growth, one needs to compute 
one wake per structure; consequently, there is considerable 
interest in performing these computations as efficiently as 
possible. A typical NLC structure comprises 206 cells. Be- 
cause of the large number of nodes, it impractical to re- 
sort to standard finite element or finite difference codes to 
compute the wake. To make computations manageable, the 
SLAC group has developed an RF circuit model. Despite 
its limitations, predictions have proven to be in remarkable 



agreement with experimental results. However, until now, 
the wake computations remained too slow to make the sim- 
ulation of a full linac practical. In this paper, we describe 
algorithmic modifications that have led to a code achieving 
three orders of magnitude improvement over previously re- 
ported performance. 

2 CIRCUIT MODEL FOR DDS 

In an RF circuit model. Maxwell's equations are discretized 
using a low order expansion based on individual closed cell 
modes. The result is a system of linear equations that can 
conveniently be represented by a circuit where voltages and 
currents are associated with modal expansion coefficient 
amplitudes. A model suitable for the computation of the 
fields excited by the dipole excitation of a detuned struc- 
ture was developed by Bane and Gluckstern [|l]]. The con- 
cept of manifold damping was later introduced by Kroll |Q] 
and the circuit model was extended by the SLAC group to 
include this feature [||]. The result is shown in Figure I. 
The corresponding equations can be put in the form 
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Figure 1: Circuit model for Damped Detuned Structures. 
The thick horizontal lines represent a transmission line. 
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(1) 

where / is the frequency and / is a unit diagonal. The 
submatrices H, H^, G and R aie N x N where N is the 
number of cells ( N = 206 for the SLAC structure). H 
and H describe respectively the TMno-like and TEm-like 
cell mode coupling, Hx represents the TE - TM cross cou- 
pling, R describes the manifold mode propagation and G 
describes the TE-to-manifold coupling. The vectors a, d 



are the normalized loop currents {a = i/ \/C^) for the TM 
and TE chains and V is the normalized manifold voltage 
at each cell location. Finally, the right hand side b repre- 
sents the beam excitation. Since the boundary conditions at 
the cell interfaces impose that the TM and TE components 
must propagate in opposite directions, only the TM cell 
modes are excited by the beam. The dipole mode energy is 
coupled out electrically to the manifold via small slots; the 
TE component of the field is therefore capacitively coupled 
to the manifold. Note that the manifold is represented by a 
periodically loaded transmission line for which only nodal 
equations make sense, resulting in a mixed current-voltage 
formulation. 

3 SPECTRAL FUNCTION 

Computing the wake of DDS structures involves solv- 
ing ([l|) over the structure's dipole mode frequency band- 
with. A longitudinal dipole impedance is first obtained 
by summing the cell voltages (in the frequency domain) 
with appropriate time delays. The transverse impedance 
is subsequently derived by invoking the Panofsky-Wentzel 
theorem. The circuit approach to wake computation in- 
troduces a small non-causal, non-physical component to 
the wake w{t) which can be suppressed by considering 
[w{t) — w{—t)]u-i{t) instead. The sine transform of 
this function, proportional to the imaginary part of the 
impedance, is known as the spectral function S{lu). In 
the context of circuit-based wake computations, S{u!) is 
a more convenient quantity to compute than the dipole 
(beam) impedance. 

4 SPARSE LINEAR EQUATIONS 

In the DDS circuit model, each cell couples only to its near- 
est neighbors. The resulting matrix is sparse and com- 
plex symmetric (a consequence of electromagnetic reci- 
procity). Computing the spectral function involves solv- 
ing a sequence systems of linear equations. At each step 
in frequency, the coefficient matrix changes slightly while 
its sparsity structure remains identical. In addition, a good 
starting approximation to the solution for any frequency 
step is provided by the solution from the previous step. 

4. 1 Iterative Methods 

An algorithm suitable for symmetric complex systems 
is the so-called Quasi Minimal Residual (QMR) algo- 
rithm This algorithm is a relative of the well-known 
conjugate gradient method which seeks to minimize the 
quadratic form [Ax — h)^ ■ [Ax — h). The QMR algo- 
rithm minizimizes a different quadratic form; in both cases 
the key to rapid convergence is suitable "preconditioning" 
of the system Ax = b with an approximate and easy to 
compute inverse. Tests were performed with RDDS circuit 
matrices using standard incomplete factorization precondi- 
tioners; but the results were somewhat disappointing. It 
is believed that with a suitable preconditioner, the method 



can be competitive; however, efforts to identify one were 
abandoned after a direct technique proved to be more than 
satisfactory. 

4.2 Direct Methods 

Direct algorithms are essentially all relatives of the elemen- 
tary Gaussian elimination algorithm, where unknowns are 
eliminated systematically by linear combinations of rows. 

A crucial point is that the order in which the rows of 
the matrix are eUminated has a direct impact on com- 
putational efficiency since a different order implies dif- 
ferent fill-in patterns []. In principle, there exists an elim- 
ination order that minimizes fill-in, which is not the same 
as the most numerically stable ordering. In some cases, it 
is even possible to find an ordering that produces no fill- 
in at all. Although the determination of a truly optimal 
ordering is an NP-complete problem, it is possible using 
practical strategies to find orderings that result in signifi- 
cant computational savings. The most successful class of 
ordering strategies are so-called "local" strategies that seek 
to minimize fill-in at each step in the elimination process 
regardless of their impact at a later stage. 

The Markowitz Algorithm A good local ordering 
strategy is the Markowitz algorithm. Suppose Gaussian 
elimination has proceeded through the first k stages. For 
each row i in the active (n — fc) x (n — k) submatrix, let 

(k) (k) 

rl ' denote the number of entries. Similarly, let c!j ' be the 
number of entries in column j. The Markowitz criterion is 
to select as pivot the entry a[j^ from the (n — k) x {n — k) 
submatrix that satisfies 

min(rf ) ~ l)(cf ^ - 1) (2) 

Using this entry as the pivot causes [rf^^ — l)(Cj'^'' — 1) 
entry modifications at step k. Not all these modifications 
will result in fill-in; therefore, the Markowitz criterion is 
actually an approximation to the choice of pivot which in- 
troduces the least fill-in. 

5 CODE DESCRIPTION 

Our code is based on the spectral function method and uses 
Markowitz ordering to solve the circuit equations in the fre- 
quency domain. Compared to the procedure outlined in [||], 
the following changes have been made: (1) The manifold 
voltage A is not separately eliminated, in order to preserve 
sparsity. (2) Once the system (|l]) is solved, the loop cur- 
rents are known and the cell voltages can be obtained by a 
simple matrix multiplication. There is therefore no need 
to form an inverse[||]. 

'The elimination process creates non-zero entries at positions wliich 
con'espond to zeros in the original coefficient matrix. The fill-in is the set 
of all entries which were originally zeros and took on non-zeros value at 
any step of the elimination process. 



Two additional remarks are in order. The process of de- 
termining the Markowitz ordering can by itself be time- 
consuming; however, since the structure of the RDDS 
matrix remains the same at every step in frequency, the 
ordering needs to be determined only once. The relative 
magnitudes of the equivalent circuit matrix entries do not 
change very significantly over the frequency band occupied 
by the dipole modes. This insures that the Markowitz 
ordering remains numerically stable for all frequency 
steps. 

Implementations of the Markowitz algorithm are widely 
available. We used SPARSE [|[|, a C implementation that 
takes advantage of pointers to store the coefficient matrix 
as a two-dimensional linked list. To each non-zero entry 
corresponds a hst node. Each node in turn points to struc- 
ture which comprises the numerical value of the entry, its 
two-dimensional indices and a pointer to an updating func- 
tion. A linked list makes sequential traversal of a row or 
a column of the matrix efficient; however, random access 
is expensive. To update the matrix at each frequency step, 
we sequentially scan the entire list and call an update func- 
tion by indirection using a pointer stored within each entry 
structure. 

The RDDS circuit matrix is not only sparse, it is also 
symmetric. The SR4RSE package does not exploit this 
structure because the standard elimination process destroys 
symmetry. We note that the Markowitz scheme can be ex- 
tented in a way that preserves symmetry. 

6 RESULTS 

Our optimized wakefield code was used to compute the 
wake envelope of the RDDS structure, using parameters 
provided by SLAC. On a 550 MHz Pentium III (Linux, 
GNU gcc compiler) a complete calculation of the wake 
takes approximately 14 seconds. This represents a gain of 
roughly three orders of magnitude compared to the previ- 
ously reported performance and allows the generation of 
wakes for an entire linac in less than four hours. Output 
from the code is presented in Figures ^ and ^ The results 
are identical to those obtained by the SLAC group. 
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Figure 2: Computed spectral function for the RDDSl struc- 
ture. 
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Figure 3: Computed wake for the RDDSl structure. 
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